(57) Abstract

Methods are provided for modification of genomic target sites, where desired changes, which can be small or subtle, may be
introduced into the target site to provide for modification of target genes or regulatory sequences. It is found that one may retain
a marker without interference with the functioning of a target gene, or select for excision of exogenous DNA to leave a single copy
of the target gene with the modification.

GENOMIC MODIFICATIONS WITH HOMOLOGOUS DNA TARGETING

ACKNOWLEDGEMENTS 
This invention was supported in part by grants
from NIH. The U.S. Government may have rights in this
invention.

INTRODUCTION 

Technical Field 

The field of this invention is genomic 
modification using homologous DNA for targeting. 

Background

There are a significant number of opportunities
for introducing genetic modifications in vivo for purposes
of correcting genetic defects and treating genetic
disorders. Among genetic disorders involving hematopoietic
cells are such defects as sickle cell anemia, β-thalassemia,
various hemoglobinopathies, and disorders of erythrocyte
metabolism including hereditary spherocytosis, pyruvate
kinase deficiency, G6PD deficiency, etc. Among genetic
disorders involving circulating plasma proteins and enzymes
are inherited disorders of the complement system such as
hereditary angioneurotic edema, agammaglobulinemia
syndromes, α1-antitrypsin deficiency and disorders of
hemostasis, such as hemophilia A, hemophilia B and Von
Willebrand's disease. Among genetic disorders involving
connective tissue, bone and muscle are the muscular
dystrophies, the mucopolysaccharidosis syndromes,
amyloidosis and various disorders of calcium and phosphate
metabolism including hypophosphatasia, rickets and
pseudohypoparathyroidism. Among the genetic disorders of
metabolism are inherited disorders of amino acid metabolism,
such as phenylketonuria, homocystinuria, albinism and
tyrosinosis; inherited disorders of carbohydrate metabolism
such as the glycogen storage diseases, diabetic syndromes
and galactosemia; inherited disorders of lipid metabolism
such as the hyperlipoproteinemias, the hyperlipidemias, the
lipoprotein deficiency syndromes, the gangliosidoses,
including Tay-Sachs disease, the lipidoses including
Fabry's disease, Gaucher's disease, Refsum's disease, and
Niemann-Pick disease; inherited disorders of steroid
metabolism, such as adrenal hyperplasia; inherited disorders
of purine and pyrimidine metabolism such as gout,
Lesch-Nyhan Syndrome, and xanthinuria; and other metabolic
disorders such as Wilson's disease, the porphyria syndromes,
and hemochromatosis. Among the genetic disorders involving
membrane transport of substances in the kidney, lung and
other organs are the malabsorption syndromes, cystinuria,
the renal tubular acidoses, cystinosis, Fanconi's syndrome
and cystic fibrosis. Among disorders involving a genetic
predisposition based on MHC antigen haplotype which may be
potentially addressable by gene therapy are multiple
sclerosis, ankylosing spondylitis, juvenile diabetes,
rheumatoid arthritis and other autoimmune disorders.

Besides gene therapy, there may be an interest in
modifying various domestic animals for a variety of purposes:
to improve their capabilities for use in supplying food,
e.g., milk, butter fat, leaner meat, etc., to provide for
animals which may be used for scientific investigations,
e.g., Class I MHC deficient mice, and the like, and to
produce proteins on a large scale, e.g., albumin and the
like.

For these different interests, there will be
different target cells for targeting for gene therapy or
gene modification. Thus, one may be interested in modifying
embryonal stem cells, somatic cells, hematopoietic stem
cells, other stem cells, cells of connective tissue origin,
including myoblasts, osteoblasts, or chondroblasts; hepatic
cells, endothelial cells, neural cells, epithelial cells,
and cells of endocrine origin, including islet cells, or the
like. The techniques and methodology used for modifying the
genotype of the target cells require that the modification
provide the desired function. While one may rely upon
random integration and selection of the clones of the
integrants, selecting for function may not be sufficient.
Random integration may result in a variety of situations,
where the integrated DNA may be subject to regulation,
depending upon the site at which it is integrated, where the
functioning gene may become inactivated upon differentiation
or proliferation of the cells, where the site of integration
may result in a change in the functioning of one or more
indigenous genes, or where the site of integration may lead
to neoplasia.

In addition, it is not presently understood what
effect the presence of DNA, particularly foreign DNA, more
particularly prokaryotic DNA, will have on the functioning
of the gene being introduced, particularly where one is
interested in homologous recombination to achieve a
modification of an indigenous gene. It is important to
understand what the effect may be on the functioning
capability of the DNA which is introduced as well as the
target gene which is to be modified.

Relevant Literature 

Valancius and Smithies, Mol. Cell. Biol. (1991)
11:1402-1408 describe an in-out targeting procedure for
making genomic modifications in mouse embryonic stem cells.
Hasty et al., Nature (1991) 350:243-246 describe the
introduction of a mutation into the HOX-2.6 locus in
embryonic stem cells by in-out targeting. Rothstein,
Targeting, Disruption, Replacement and Allele Rescue:
Integrative DNA Transformation in Yeast, Methods in
Enzymology (1991); 281-301 describes in-out targeting in
yeast. Smithies et al. Nature (1985) 317:230-234 describe
targeting the human β-globin locus in a mouse
erythroleukemia hybrid cell line containing a single human
chromosome eleven. Reports indicating that deletions or
rearrangements involving the ends of targeting constructs
and surrounding sequences sometimes accompany a gene
targeting event may be found in articles by Doetschman,
Maeda and Smithies, Proc. Natl. Acad. Sci. USA (1988)
85:8583-8587; Jason et al., Genes Dev. (1990) 4:157-166 and
McMahon and Bradley, Cell (1990) 62:1073-1085. Also
reported have been occasional secondary integrations
accompanying a homologous recombination event, involving
either the targeting construct or cotransfected
selectable DNA fragments. See Jason et al. and McMahon and
Bradley, supra. See also Shulman et al. Mol. Cell. Biol.
(1990) 10:4466-4472 and Stanton et al., ibid (1990)
10:6755-6758.

SUMMARY OF THE DISCLOSURE
Methods are provided for performing homologous
recombination employing one or more selectable markers and
a homologous region for introducing a modification in an
indigenous chromosomal gene in a mammalian host cell. Two
methods are employed for diminishing interference of the
marker with the functioning of the target locus: (1) use of
an Ω- (replacement) targeting vector, which allows retention
of the selectable marker(s) in such a manner that the marker
does not interfere with the desired function or expression
of the indigenous gene; and (2) use of an O- (insertional)
targeting vector, which allows for excision of the
selectable marker(s). The resulting cells have the modified
target locus without the indigenous sequence at the target
site.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 
Methods are provided for targeted modification of
indigenous genes using vectors comprising extended homology
with the target gene, but differing in at least one site, in
conjunction with a marker gene. A linear DNA construct is
employed with two regions of homology to the target locus.
Frequently, these regions will be proximal to the ends of
the linear DNA molecule. The linear DNA molecule is
transformed into the target cell by any convenient means and
selection for integrants is performed using the selection
provided by the marker gene. The selected clones are then
further screened for target genes having the desired
modification. If desired, one may further screen for loss
of the selection gene and other foreign DNA with retention
of the target gene with the introduced modification.

In targeting indigenous genes for modification,
frequently subtle modifications, it is often important to
ensure that the target locus is not modified in a way which
interferes with the functioning of the target locus. When
modifying indigenous DNA, one normally requires a selectable
marker, which allows for selection of cells into which a
construct has become integrated. The marker is normally
integrated with the regions of homology and will be in close
proximity to the region of homology. Therefore, methods
must be devised which substantially ensure the functioning
of the target locus.

Two different strategies are employed. In the
first strategy an Ω-vector is employed, where hybridization
results in a loop or D-loop of the hybrid. The resulting
integrant retains the marker and provides a functioning
target locus. The selectable marker is situated in such a
manner that it does not significantly interfere with the
function or expression of the target locus. For example,
the marker may be located 5' of the known local
transcription regulatory sequences, within an intron or 3'
of the coding region of the gene, etc. Usually the marker
will be within 10 kbp, more usually within 5 kbp, of the
gene.

In the second method, in-out targeting, an
O-vector is employed where insertion results in two copies of
the target locus homologous sequence: the indigenous
sequence and the modified sequence. In a second step, an
excision step, the indigenous sequence and marker(s) are
excised leaving only the modified sequence. Thus, the end
points are different for the integration and the excision.

The subject method can be used in a variety of
ways for treating a variety of genetic diseases, mapping
chromosomes, identifying loci, and the like. Of particular
interest is the modification of dysfunctional genes, where
the dysfunctional gene may be substituted with a functional
gene. Thus, gene therapy may be carried out on a variety of
types of cells, resulting in functional or modified genes
from dysfunctional or undesired allelic genes. In addition,
one may wish to modify a phenotype by modifying the
capability of the functional gene, enhancing or diminishing
the level of expression, changing the spectrum of activity
of a pleiotropic gene, changing a particular allele, as in
the case of major histocompatibility antigens, T-cell
receptor variable regions, and the like.

The DNA employed for targeting will have a region
of homology with the target locus differing from the locus
by a modification, which may be a substitution, deletion,
insertion, or combinations thereof. Also included will be
a marker gene which allows for selection. One or more
unique primer sites may be present for subsequent PCR
analysis. The homologous DNA will usually be not more than
about 100 kbp, more usually not more than about 20 kbp, and
usually more than about 0.5 kbp.

The target cells may be any of a variety of
vertebrate cells, particularly animal cells, more
particularly mammalian cells, which may include any of the
cells previously described. The constructs will comprise a
region of homology associated with the target gene, where
the region of homology may be noncoding, coding, or
combinations thereof. A noncoding region may comprise the
5' noncoding region, introns, and in some instances the
3' noncoding region. The homologous region will normally
encompass the modification, which may be a single site or a
polynucleotide, usually not greater than about ten percent
of the homologous region, e.g., 500 bp, usually not more than
about five percent of the homologous region, e.g., 250 bp.
The modification will normally be bordered by a total of at
least about 50 bp of homology, usually 100 bp, and less than
about 100 kbp, usually less than about 50 kbp.
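
The size guidelines in the preceding paragraphs can be gathered into a simple check. The following Python sketch is illustrative only: the function name `check_construct` and its argument names are hypothetical, and the thresholds are the approximate bounds stated above, not hard limits.

```python
def check_construct(homology_bp, modification_bp, flanking_homology_bp):
    """Return guideline warnings for a targeting construct design.

    Thresholds follow the approximate bounds given in the text;
    the function and argument names are hypothetical.
    """
    warnings = []
    if homology_bp > 100_000:
        warnings.append("homologous DNA exceeds about 100 kbp")
    if homology_bp <= 500:
        warnings.append("homologous DNA is not more than about 0.5 kbp")
    if modification_bp > 0.10 * homology_bp:
        warnings.append("modification exceeds about 10% of the homologous region")
    if flanking_homology_bp < 50:
        warnings.append("modification bordered by less than about 50 bp of homology")
    return warnings


# A 5 kbp homologous region with a 20 bp change passes every check.
print(check_construct(5_000, 20, 4_980))
# A 600 bp change in the same region exceeds the ten percent guideline.
print(check_construct(5_000, 600, 4_400))
```

The 5 kbp example mirrors the figures in the text, where ten percent of the homologous region is given as 500 bp.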

The restriction site which provides for the site
of linearity of the DNA for the O-type vector employed for
integration will be desirably between the site(s) of
modification and the shorter stretch of homology.
Desirably, one may have a short gap in the homology at the
termini. The missing sequence which is filled in during the
targeting can provide a primer site, so that targeted
integration may be readily detected.

Depending on the nature of the vector, the
organization of the functional sequences in the vector will
vary. With the Ω-vector, the areas of homology will include
regions flanking the marker(s) and the modified homologous
region, where the flanking regions may or may not be
immediately adjacent in the target locus. A double
cross-over event is targeted, resulting in replacement of the
chromosomal region lying between the flanking homologous
sequences. With the O-vector, the regions of homology to
the chromosome are usually adjacent in the chromosomal
target locus. The homologous region includes the modified
region. Cross-over events resulting in the integration of
the vector are selected, since the termini of the linear
construct, when joined, define a sequence of substantially
continuous homology. This results in the formation of two
target loci, indigenous and modified, with the markers
between the two target loci. Upon excision, the indigenous
locus and markers will be lost and the desired modification
will be retained, provided that the excisional cross-over
occurs on the other side of the modification. The Ω-vector
has an internal loop while the O-vector has an external loop
of non-homologous sequence. Thus, in the Ω-vector the
termini are distant at the target locus, while in the
O-vector the termini are proximal at the target locus.
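
The dependence of the in-out outcome on the two cross-over positions can be illustrated with a toy model. The Python sketch below is a simplification under stated assumptions: the locus is reduced to a left arm `L`, a site (`x` indigenous, `X` modified) and a right arm `R`, and the function names are hypothetical.

```python
INDIGENOUS = ["L", "x", "R"]  # x: indigenous sequence at the site
MODIFIED = ["L", "X", "R"]    # X: introduced modification


def integrate(crossover_arm):
    """Insert an O-type vector by a single cross-over in arm 'L' or 'R'.

    The result is a duplication with the marker between the two copies.
    """
    if crossover_arm == "R":
        return INDIGENOUS + ["marker"] + MODIFIED
    if crossover_arm == "L":
        return MODIFIED + ["marker"] + INDIGENOUS
    raise ValueError("crossover_arm must be 'L' or 'R'")


def excise(locus, crossover_arm):
    """Resolve the duplication by a cross-over between the two copies.

    A cross-over in the L arms loops out the left copy's site and the
    marker; a cross-over in the R arms loops out the marker and the
    right copy's site.
    """
    left_copy, right_copy = locus[:3], locus[4:]
    if crossover_arm == "L":
        return list(right_copy)
    if crossover_arm == "R":
        return list(left_copy)
    raise ValueError("crossover_arm must be 'L' or 'R'")


for in_arm in "LR":
    for out_arm in "LR":
        final = excise(integrate(in_arm), out_arm)
        print(in_arm, out_arm, "modified" if "X" in final else "indigenous")
```

In this model the modification is retained only when the excisional cross-over falls in the arm opposite to the one used for integration, which is the condition stated in the text.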

Various markers may be employed for selection.
These markers include the HPRT minigene (Reid et al. (1990)
Proc. Natl. Acad. Sci. USA 87:4299-4303), the neo gene for
resistance to G418, the HSV thymidine kinase (tk) gene for
sensitivity to gancyclovir, the hygromycin resistance gene,
etc. As indicated above for the O-vector, by linearizing
within the region of homology, the marker gene(s), with
accompanying foreign DNA (by foreign is intended foreign to
the target host), will be situated between the duplicated
genes, where one of the genes will have the introduced
modification, while the other will be the indigenous
sequence.

When carrying out in-out targeting, one may take
advantage of using a marker that can be employed for both
positive selection and negative selection of the out step,
e.g., HPRT. Alternatively, one may use separate markers for
positive selection, e.g., neo, hygromycin resistance, etc.,
and for negative selection, e.g., HSV-tk gene, cytosine
deaminase, etc. The positive selection marker allows one to
select for integrants, eliminating cells lacking antibiotic
resistance. Upon excision, the negative selection marker
allows one to select against cells which retain the negative
selection marker.

When carrying out targeting with an Ω-vector, one
may employ a negative selection marker situated outside of
the flanking homologous regions to enrich for double
cross-over events.

Other aspects of the construct may include
sequences which allow for specific primer regions for
polymerase chain reaction (PCR) identification of homologous
recombinants, one or more restriction sites, which allow for
identification by gel electrophoresis, removal of a
restriction site at the target locus, or other modification
which allows for identification of target cells which have
undergone the desired modification. In addition, the
changes outside of the coding region should allow for
retention of the transcriptional regulation region, unless
some change in the transcriptional regulation region is
desired. Therefore, the gene for selection, restriction
sites, primer sites, etc. will desirably be 5' or 3' of the
coding region or within introns. The integrated DNA
sequence will usually be at least about 0.5 kbp, more
usually at least about 1 kbp and usually less than about
100 kbp, more usually less than about 50 kbp.

Various techniques may be employed for introducing
the linear DNA into the target cell. Techniques include
electroporation, calcium precipitated DNA, fusion,
transfection, and the like. The particular manner by which
the DNA is introduced is not critical to this invention,
although electroporation is preferred.

Once the target cells have been transformed, the
cells may then be selected by means of the marker gene.
Thus, the cells may be plated in a selective medium or grown
in selective culture and clones identified for further
investigation. Thus, where excision of the marker gene is
not required, the clones may be analyzed using PCR,
employing primers which will provide for different sized
fragments, depending upon whether homologous recombination
has occurred and whether the modified gene or the wild type
gene is retained or other event has occurred to modify the
target gene. In this way, target cells which have undergone
the desired modification may be identified. Alternatively,
one may look to the expression product by using antibodies
specific for the modified gene expression product. Thus,
one may perform any one of numerous immunoassays for
identification of the expression product. Where the gene
expresses a surface membrane protein, one may use monoclonal
antibodies in conjunction with FACS for identification of
cells expressing the desired product. It is found that the
presence of the marker gene at the target locus does not
significantly interfere with the expression of a target
gene, allowing for substantially normal expression of the
target gene in a host cell.

In some instances, one may wish to have the marker
gene(s) removed along with the exogenous DNA. This may
naturally occur as the result of various excision
mechanisms, such as intrachromosomal recombination or
homologous excision via unequal sister chromatid exchange.
In either event, a single copy of the gene will be obtained,
which will be either the indigenous gene or the modified
gene.

As already indicated, the subject methodology may
be used for gene therapy and mammalian fine-structure
genetic analysis. Genes which may be targeted for gene
therapy include β-globin, enzymes of erythrocyte metabolism,
the complement system, coagulation factors, dystrophin,
enzymes of carbohydrate, lipid, amino acid, steroid and
purine and pyrimidine metabolism, transport proteins, e.g.,
cystic fibrosis transmembrane regulator, and the like.

The following examples are offered by way of 
illustration and not by way of limitation. 

EXPERIMENTAL 

Example 1. Correction of a Human β^S-Globin Gene By Use of
a Replacement Vector.

Targeting Construct Construction and Preparation. The
targeting construct β4.7NEO is a 4.7 kb BamHI/Xba I fragment
that includes the β^A globin gene and surrounding sequences.
It also contains a 20 bp oligomer inserted at the Sph I site
614 bp upstream of the start of the normal β globin
transcript, and a 1.2 kb Xho I/Sal I fragment from pMC1neo
Poly A (Thomas and Capecchi (1987) Cell 51:503-512) (the
neomycin-resistance gene in this particular version of
pMC1neo Poly A, from Stratagene, contains a point mutation
that reduces its ability to confer resistance to G418)
inserted into a Bgl II site in the oligomer by blunt ending
both oligomer and the insert DNA with Klenow polymerase.

For use in electroporations, the targeting
sequences were excised from the vector plasmid Bluescript+
by a Sma I/Xba I double digest. This leaves one base of
nonhomology at the 5' end of the targeting construct. The
DNA was precipitated with EtOH after digestion and
resuspended in phosphate buffered saline.

Cell Lines and Tissue Culture Conditions. The cell line BSM
is a hybrid murine-human cell line derived from the fusion
of murine erythroleukemia cells (MEL 179, APRT-, available
from Dr. A. Deisseroth) and human EBV-transformed
lymphoblasts derived from an individual heterozygous for
HPFH-2 and for the β^S globin gene. Fusion between the two
cell lines was carried out in the presence of PEG
(polyethylene glycol) 15 (50% PEG 15 in 75 mM Hepes), and
hybrid selection was achieved in AA media (50 µM adenine, 40
µM alanosine) in which only hybrid cells are expected to
survive. After 2-3 weeks of selection, these hybrids
appeared as clonal outgrowths and were tested for the
presence of human chromosome 11 using a monoclonal antibody
against a chromosome 11-encoded antigen (Papayannopolou et
al., (1986) Cell 46:469-476). Hybrids were maintained in
non-selective media and were occasionally enriched for the
presence of human chromosome 11 by immunoadherence
("panning") to the monoclonal antibody. Cell line BSM
carries only the copy of human chromosome 11 with the β^S
allele, as judged by gene-specific polymerase chain reaction
(PCR) amplification, and by determining the pattern of human
globin expression after induction using previously reported
methods (Papayannopolou et al., (1988) Science 242:1056-58).
Cell line PC4 was constructed as a positive
control for PCR amplification with primers 1 and 2 (SEQ ID
NOS: 1 and 2, respectively). The cell line contains a
portion of the sequences in the targeting construct β4.7NEO
(from the 3' end of the oligomer used as a primer binding site
for primer 2 (SEQ ID NO: 2) through the 5' BamHI site), plus
further upstream sequences from the globin region. PC4
therefore contains the binding sites for primers 1 and 2
(SEQ ID NOS: 1 and 2, respectively) and probes A and B, but
lacks the neomycin gene and the β globin gene and therefore
lacks binding sites for primers 3, 4, and 6 (SEQ ID NOS: 3,
4 and 6, respectively) and probe C. Clone PC4 was obtained
by co-electroporating the construct into BSM cells with
pMC1neo Poly A, followed by selection for G418 resistance
and screening for PCR amplification with primers 1 and 2
(SEQ ID NOS: 1 and 2, respectively).

All cells were grown at 37°C and 5% CO2 in RPMI-1640
(Gibco) with 13% heat-inactivated fetal calf serum,
supplemented with 2 mM L-Glutamine. G418 selection was
carried out with 300 µg/ml G418 sulfate (Gibco).

Electroporation. Cells were fed with fresh medium the day
prior to electroporation, and were harvested when at a
density of approximately 10* cells/ml. For electroporation,
they were resuspended at 2x10^ cells/ml in warm growth
medium, and digested pβ4.7NEO was added to a final
concentration of 5 nM. Electroporation was with 10^ cells in
a chamber of 5 mm length and 100 mm² cross section as
described (Boggs et al., (1986) Exp. Hematol. 14:988-994).
The electric pulse, from a 400 µF capacitor charged to 400 V
(800 V/cm), was for one second. After electroporation, the
cells were diluted into warm growth medium, and 10 ml was
immediately plated into each of sixteen 100 mm diameter
dishes at either 5x10^ or 1.5x10* cells/dish. Electroporated
cells were also plated into microtiter plates at 100
µl/well, at the same concentration and diluted threefold and
tenfold. The next day an equal volume of medium containing
600 µg/ml G418 was added to the cultures. The number of
G418-resistant clones in the 100 mm dishes was estimated by
the Poisson distribution from the frequency of microtiter
wells having no G418-resistant cells.
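
The Poisson estimate mentioned above follows from the zero term of the distribution: if a fraction f0 of wells contains no resistant clone, the mean number of clones per well is -ln(f0). The Python sketch below illustrates the arithmetic; the well counts in the example are hypothetical and are not data from this disclosure, while the 100 µl well volume and 10 ml dish volume are taken from the text.

```python
import math


def mean_clones_per_well(total_wells, empty_wells):
    """Estimate mean clones per well from the fraction of empty wells."""
    if not 0 < empty_wells <= total_wells:
        raise ValueError("need at least one empty well and empty <= total")
    return -math.log(empty_wells / total_wells)


def clones_per_dish(total_wells, empty_wells,
                    well_volume_ml=0.1, dish_volume_ml=10.0, dilution=1.0):
    """Scale the per-well estimate to a dish plated from the same suspension.

    dilution is the fold dilution of the microtiter sample relative to
    the dish (1.0 for the same concentration, 3.0 or 10.0 for the
    threefold and tenfold dilutions).
    """
    per_well = mean_clones_per_well(total_wells, empty_wells)
    return per_well * dilution * dish_volume_ml / well_volume_ml


# Hypothetical plate: 13 of 96 wells show no resistant cells.
print(round(mean_clones_per_well(96, 13), 2))
print(round(clones_per_dish(96, 13)))
```

The estimate becomes unreliable when very few wells are empty, which is presumably why diluted platings were set up alongside the undiluted one.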



Probes. Probe A is a 218 bp Sty I/BamHI fragment from the
genome just 5' of the targeting construct sequences. Probe
B is a 627 bp Hpa I fragment from the 5' region of the
targeting construct. Probe C is a 920 bp BamHI/EcoRI
fragment covering the human β globin IVS2 region.

An RNA probe assaying human globin gene
transcription was transcribed in vitro from a genomic
0.77 kb EcoRI/Pst I fragment cloned in the antisense
orientation into a T7 vector; it contains the 3' end of the
human β globin gene and has 212 bases of homology to the
transcript. The mouse probe was transcribed from a 0.65 kb
BamHI/Hinf I cDNA-derived fragment cloned into an Sp6 vector;
it has 298 bases of homology to the mouse gene
transcript.

PCR. Polymerase chain reactions were as previously
described (Kim and Smithies (1988) Nucleic Acids Res.
16:8887-8903) except that the MgCl2 was 2 mM. Screening of
pools and individual clones was with 15 µl crude cell
lysates, corresponding to approximately 10^ cells, in a
total reaction volume of 25 µl. Forty cycles were executed
with denaturation for 1 minute at 90°C, and extension for 10
minutes at 60°C. Samples were analyzed by electrophoresis
in 1.5% agarose gels followed by Southern blot analysis.
Allele-specific PCR was for 30 cycles with 500 ng of genomic
DNA with primers and reaction conditions as described (Wu et
al. (1989) Proc. Natl. Acad. Sci. USA 86:2757-2760).
Samples were analyzed by electrophoresis in 6% acrylamide
gels. Primer sequences are:
primer 1 (SEQ ID NO: 1) = 5'-CCCAGACACTCTTGCAGATT-3';
primer 2 (SEQ ID NO: 2) = 5'-CAGATCTGGCTCGAGGCATG-3';
primer 3 (SEQ ID NO: 3) = 5'-TGCGCTGACAGCCGGAACAC-3';
primer 4 (SEQ ID NO: 4) = 5'-AATAGACCAATAGGCAGAG-3';
primer 5 (SEQ ID NO: 5) = 5'-CACCTGACTCCTGT-3';
primer 6 (SEQ ID NO: 6) = 5'-CACCTGACTCCTGA-3'.
Note that primer 2 (SEQ ID NO: 2) has
1 bp of nonhomology to the targeting construct at its 5'
end.
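
The primer list can be inspected programmatically. The Python sketch below uses the sequences as transcribed above and reports their lengths and the positions at which primers 5 and 6 differ. It is a bookkeeping aid only; the helper name `mismatch_positions` is hypothetical, and this passage does not state which of primers 5 and 6 matches which allele.

```python
PRIMERS = {
    1: "CCCAGACACTCTTGCAGATT",
    2: "CAGATCTGGCTCGAGGCATG",
    3: "TGCGCTGACAGCCGGAACAC",
    4: "AATAGACCAATAGGCAGAG",
    5: "CACCTGACTCCTGT",
    6: "CACCTGACTCCTGA",
}


def mismatch_positions(a, b):
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


for number, seq in PRIMERS.items():
    print(f"primer {number}: {len(seq)} nt")

diff = mismatch_positions(PRIMERS[5], PRIMERS[6])
print("primers 5 and 6 differ at positions:", diff)
print("difference is at the 3' terminus:", diff == [len(PRIMERS[5]) - 1])
```

Primers 5 and 6 are identical except for their 3'-terminal base, the arrangement generally used for allele-specific priming, where extension depends on a match at the 3' end.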



Globin Analysis. The induction of globin synthesis and
antibody labeling were as described (Papayannopolou et al.,
(1986) supra) with a β^S globin-specific monoclonal antibody
(Papayannopolou et al., British J. of Hematology 35:25-31)
and a more general human β globin-specific monoclonal
antibody (Stammatayanopolous et al. (1983) Blood
61:530-539).

The slot blot hybridization assay used to assess
relative levels of both murine and human globin cytoplasmic
RNA in cells has been described (Constantoulakis et al.
(1989) Blood 74:1963-1971).

Gene Targeting. Gene targeting was used to correct the β^S
globin gene on a human chromosome 11 in the mouse
erythroleukemia hybrid cell line BSM. The β^A globin
replacement (or Ω) type targeting construct, β4.7NEO, has
4.7 kb of sequences homologous to the human β^A globin
region; it also contains a unique oligomer for use as a PCR
primer binding site (primer 2) (SEQ ID NO: 2), and a
neomycin-resistance gene, both placed 5' of any known local
β globin transcriptional regulation sites. The neomycin
gene is a 1.2 kb fragment derived from pMC1neo PolyA and is
driven by the herpes simplex thymidine kinase (tk) promoter
plus the duplicated mutant polyoma virus enhancer originally
designed for use in mouse embryonic stem cells (Thomas and
Capecchi (1987), supra).

The targeting construct was introduced into BSM
cells in a series of eight electroporations. After
electroporation, the cells were diluted, and a total of
4.1x10*' were immediately plated into tissue culture dishes.
The following day they were placed under G418 selection
which yielded 126 pools of between 10 and 1000
G418-resistant clones per dish (average about 200), each clone
having incorporated the targeting construct somewhere in the
genome.

Detection of Targeted Clones. Upon completion of selection,
and growth to adequate density (10 to 20 days), a small
portion of each pool was removed and tested by a PCR assay
(Kim and Smithies (1988) Nucleic Acids Res. 16:8887-8903)
for the presence of a targeted clone. Two primers were
used: primer 1 (SEQ ID NO: 1) is specific to the target
locus, since it is from approximately 400 bp upstream of the
5' end of the targeting fragment; primer 2 (SEQ ID NO: 2) is
specific to the incoming DNA, since it is from the synthetic
oligomer sequences present only in the targeting DNA.
Specific PCR amplification to give a 1.2 kb diagnostic band
hybridizing to probe A can only occur if a targeting event
juxtaposes these two primers. The PCR results led to the
identification of a pool containing a targeted clone.
Although the PCR generated two hybridizing bands, rather
than the expected single band, the pattern is still
indicative of a targeted recombinant since the positive
control cell line PC4 yields the same two bands, while the
parental BSM cells yield neither. In all, a total of three
PCR positive pools were identified among the 126 pools
examined. This corresponds to a targeting frequency of one
targeted clone in about 9,700 G418-resistant clones.
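
The reported frequency can be related to the pool counts by a short back-calculation. The Python sketch below assumes that each PCR-positive pool contains exactly one targeted clone, an assumption made here for illustration and not stated in the text; the variable names are likewise hypothetical.

```python
pools_screened = 126                   # pools examined by PCR
positive_pools = 3                     # PCR-positive pools
reported_clones_per_targeted = 9_700   # "one targeted clone in about 9,700"

# Total G418-resistant clones implied by the reported frequency,
# assuming one targeted clone per positive pool.
implied_total_clones = positive_pools * reported_clones_per_targeted

# Mean pool size implied by that total.
implied_mean_per_pool = implied_total_clones / pools_screened

print(implied_total_clones)
print(round(implied_mean_per_pool))
```

Under that assumption the implied mean pool size is somewhat above 200 clones, of the same order as the "average about 200" quoted for the pools.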

The PCR analysis included positive control lysates
made from mixtures of PC4 cells with untreated BSM cells at
ratios of 1 to 10 and 1 to 100. PC4 is a pseudo-recombinant
cell line which contains integrated copies of foreign DNA
having primers 1 and 2 (SEQ ID NOS: 1 and 2, respectively)
already juxtaposed. This positive control proved essential
for working out appropriate PCR conditions, but introduced
the risk that false positives might arise from contamination
of nontargeted cells with the diagnostic 1.2 kb fragment
from amplified controls. The possibility that the positives
were artifactual contaminants of this type was excluded by
carrying out a second set of PCR amplifications using a
neomycin gene-derived primer (primer 3) (SEQ ID NO: 3) in
place of primer 2 (SEQ ID NO: 2). All three pools that the
first PCR assay had indicated contained a targeted clone
yielded the expected 1.6 kb band that hybridized to probe A
during this second PCR reaction; as expected, the positive
control PC4 cells did not.

Isolation of a Targeted Clone. Sib-selection was used to 
isolate a clone of targeted cells from one of the PCR- 
positive pools that contained about 200 independent G418- 

30 resistant clones. Cells from this pool were diluted into 96 
smaller pools with approximately 10 cells in each. After 
expansion, these smaller pools were rescreened by PCR for 
the presence of the diagnostic 1.2 kb band. Three of the 
smaller pools gave a positive PCR result, and the signal 

35 level corresponded to that expected for mixtures having one 
targeted cell for every ten nontargetted . 
One of the enriched pools was then diluted into
microtiter dishes so that only 10 to 20% of the wells
received a cell. After growth to adequate density, samples
were removed from 66 wells containing growing cells for PCR
amplification. Three of the 66 clone wells showed the
1.2 kb diagnostic band at the same level as undiluted PC4
control cells, and were therefore presumed to contain clones
of targeted cells.
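
The reason for seeding so that only 10 to 20% of the wells receive a cell can be illustrated with a short calculation. The sketch below assumes that cells distribute among wells according to a Poisson distribution (an assumption, not something stated in the text) and estimates what fraction of the occupied wells would then hold exactly one cell. The function name is hypothetical.

```python
import math


def clonal_fraction(occupied_fraction):
    """Fraction of occupied wells expected to hold exactly one cell (Poisson)."""
    # Mean number of cells per well implied by the fraction of occupied wells.
    mean_cells = -math.log(1.0 - occupied_fraction)
    # Probability that a well receives exactly one cell.
    p_one = mean_cells * math.exp(-mean_cells)
    return p_one / occupied_fraction


for f in (0.10, 0.20):
    print(f"{f:.0%} of wells occupied: "
          f"{clonal_fraction(f):.1%} of occupied wells hold a single cell")
```

On this model, roughly nine out of ten growing wells would be derived from a single cell, which is why sparse seeding is a reasonable way to obtain clones.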



Southern Blot Analysis. One isolate of the presumed
targeted clone was expanded, and used to prepare genomic DNA
for Southern blot analysis. When the 5' probe B was used,
a 12.8 kb band generated by Pvu II digestion of genomic DNA
from the starting BSM cells is replaced by a predicted
6.9 kb band in the targeted cells; likewise a parental 5.5 kb
EcoRI fragment is replaced by a predicted 3.7 kb band.
Similarly, with the 3' probe C, a parental 5.0 kb Bgl II band
becomes, as predicted, 4.0 kb after targeting, and, again as
predicted, the parental 12.8 kb Pvu II band becomes the
predicted 7.1 kb band.

Southern blot analysis was also used to
investigate the fidelity of the targeting event. A BamHI
site that begins 1 bp from the 5' end of the targeting
construct is still intact after targeting, as shown by the
presence of a predicted 2 kb fragment that hybridizes to
probe B following BamHI digestion of DNA from the targeted
clone; the parental BSM DNA gives a 1.9 kb band. Likewise,
the Xba I site at the 3' end of the targeting construct
was shown to be intact by the presence of a 4.5 kb band that
hybridized to probe C following an Xba I/Pvu II double
digest of the targeted clone, compared to the 10 kb band
from parental BSM DNA. These hybridization patterns
establish that no end deletions have occurred and also
re-confirm the targeting event itself.

The Southern blots, and comparable blots using
neomycin- and vector-specific probes from the targeting
plasmid, show that the genome of the targeted clone contains
only a single copy of the targeting construct integrated at
the desired location.

Confirmation of Correction to the β^A allele. Two
independent methods were used to demonstrate that the β^S
allele had been changed to β^A by the targeting. First,
allele-specific PCR was performed on various cells using
primer 4 (SEQ ID NO: 4) together with either the β^S-specific
primer 5 (SEQ ID NO: 5), or the β^A-specific primer 6 (SEQ ID
NO: 6). The parental (β^S-containing) BSM cells, as expected,
yielded no amplified band when the β^A-specific primer 6 (SEQ
ID NO: 6) was used, but a band was obtained with the
β^S-specific primer 5 (SEQ ID NO: 5). A pool of
G418-resistant, but nontargeted, cells yielded an amplified
band when either primer set was used, although a more intense
band was observed with the β^A-specific primer than with the
β^S-specific primer. This result is expected because the
human chromosome 11 is not present in all of the hybrid
cells, whereas a randomly integrated targeting construct
must be present in all cells (in order to obtain G418
resistance), and often occurs in multiple copies. The
targeted clone, again as expected, amplifies a band when the
β^A-specific primer 6 (SEQ ID NO: 6) is used, but not with the
β^S-specific primer 5 (SEQ ID NO: 5). These observations
establish that the targeted clone now contains a β^A globin
gene, but no longer carries the β^S gene.
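
The logic of the allele-specific PCR can be summarized in a small sketch. The two primer sequences are copied from the sequence listing (SEQ ID NOS: 5 and 6); the labels "sickle" and "corrected" and the function names are illustrative choices, and the interpretation table simply restates the three outcomes described above rather than adding any new result.

```python
# Allele-specific primers as given in the sequence listing.
PRIMER_5 = "CACCTGACTCCTGT"  # SEQ ID NO: 5, specific for the sickle allele
PRIMER_6 = "CACCTGACTCCTGA"  # SEQ ID NO: 6, specific for the corrected allele


def mismatch_positions(a, b):
    """Return the 1-based positions at which two equal-length primers differ."""
    if len(a) != len(b):
        raise ValueError("primers must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def interpret(band_with_primer_5, band_with_primer_6):
    """Map the two allele-specific PCR outcomes to the alleles detected."""
    alleles = []
    if band_with_primer_5:
        alleles.append("sickle")
    if band_with_primer_6:
        alleles.append("corrected")
    return alleles or ["neither detected"]


# The two primers differ only at their final (3'-terminal) base.
print(mismatch_positions(PRIMER_5, PRIMER_6))
print("parental BSM cells:", interpret(True, False))
print("nontargeted G418-resistant pool:", interpret(True, True))
print("targeted clone:", interpret(False, True))
```

Because the primers differ only at the 3'-terminal base, amplification with one primer but not the other discriminates between the two alleles.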

Secondly, allele-specific antibodies were used to
investigate the globin polypeptides synthesized after the
parental BSM cells and the targeted clone have been induced
with HMBA (hexamethylenebisacetamide). An antibody specific
for human β globin, but unable to discriminate between β^S
and β^A globin, shows the presence of some type of human β
globin in both the induced BSM cells and the induced
targeted clone. In contrast, a human β^S-specific antibody
binds to the globins produced by the induced parental BSM
cells, but not by the targeted clone. These observations
establish that although the targeted clone is able to
synthesize human β globin, it can no longer synthesize β^S
globin.

Regulation of Globin Expression. A slot blot hybridization
assay with probes specific for murine β^maj globin and human
β globin was used to ask whether or not the targeted clone
had its ability to undergo human β globin induction
altered. Specifically, the ratios of induced to uninduced
cytoplasmic RNA for both murine β^maj and human β globin
transcripts within cell line BSM were determined and found to
be essentially identical (17:1 and 15:1, respectively).
Similarly, the ratios of induced to uninduced levels of
murine β^maj compared to human β globin within the targeted
clone were also essentially identical (7:1 and 5:1,
respectively). Since the inducibility of the murine and
human β globin genes is essentially the same within the
polyclonal parental BSM cells, and within the targeted
clone, we conclude that the targeting event has not
significantly altered the ability of the (now corrected)
human gene to respond to induction.

The above results demonstrate a frequency of
targeting of at least one targeted clone for 9,700
G418-resistant clones. The Southern blot analysis rigorously
established that the isolated clone had been targeted as
planned and verified the fidelity of the targeting event in
the isolated clone. No detectable secondary events occurred
along with the gene targeting event. Human globin genes
carried on their native human chromosomes introduced into
the BSM cells by somatic cell fusion are regulated and
expressed after induction in a manner comparable to that
shown by the endogenous mouse globin genes (Marks and
Rifkind (1978) Ann. Rev. Biochem. 47:419-448; Willing et
al. (1979) Nature 277:534-538; Deisseroth and Hendrick
(1979) Proc. Natl. Acad. Sci. USA 76:2185-2189). The results
demonstrate that gene targeting can correct a human β^S
globin gene to β^A, an essential requirement before gene
targeting can be considered for human gene therapy.
Furthermore, the induction ratio of the corrected gene was
not significantly altered by the introduction of a
neomycin-resistance helper gene into the target locus to
facilitate identification and isolation of the targeted clone.

Example 2. The Use of In-Out for Making Subtle Genomic
Modifications in Mouse Embryonic Stem Cells.

Cell culture. The mouse ES cell line E-14TG2a was isolated
as described previously (Hooper et al. (1987) Nature
326:292-295; Thomas and Capecchi (1987), supra). Cells were
grown in Dulbecco's modified Eagle's medium (GIBCO)
supplemented with 15% heat-inactivated fetal calf serum
(Flow) and 10 µM 2-mercaptoethanol (Sigma). The
pluripotential nature of the ES cells was retained by
supplementing each liter of growth medium with 10* U of 
recombinant human leukemia inhibitory factor (available from
N. Gough, Walter and Eliza Hall Institute, Melbourne,
Victoria, Australia). Because feeder layers were not used,
all culture dishes were coated with 0.1% sterile gelatin to
ensure cell adhesion. HAT medium was standard culture
medium supplemented with 120 µM hypoxanthine, 0.4 µM
aminopterin, and 20 µM thymidine. 6-TG (thioguanosine)
selection was carried out in standard medium containing
10 µM 6-TG. Cultures were incubated at 37°C in an
atmosphere of 5% CO2. They were checked periodically for
mycoplasma contamination.

Vectors. Plasmid pNMR133 has already been described
(Doetschman et al. (1987) Nature 330:576-578). It contains
5 kb of DNA identical to the exon 3 target region of the
mouse HPRT gene, except for a 4-bp insertion that destroys a
unique HindIII site and consequently generates a new NheI
site in intron 2. It also carries the human HPRT promoter
and exon 1 sequences (which have been shown to function in
mouse cells) and the mouse exon 2 region.

Plasmid pNMR133D200 was derived from pNMR133 by
removing a 200-bp BglII fragment from intron 2.
DNA preparation. Targeting vector DNAs were prepared by
standard methods, omitting the CsCl purification, which was
found unnecessary. All targeting DNAs were linearized by
restriction enzyme digestion, using the manufacturers'
recommended conditions, prior to electroporation. Digested
DNAs were ethanol precipitated and resuspended in sterile TE
buffer (0.05 M Tris, 0.001 M EDTA).

DNA transfers and selections. The vectors were introduced
into the ES cells by electroporation (Boggs et al. (1986)
Exp. Hematol. 14:988-994). The cells were grown in 100-mm
culture dishes (as described above) to a density of 1 x 10^7
to 2 x 10^7 cells per dish in nonselective medium. Cultures
were trypsinized, centrifuged, and then resuspended in
nonselective medium to a density of 4 x 10^7 to 10 x 10^7
cells per ml. A 0.5-ml sample of the cell suspension was
added to each microfuge tube, and prepared DNA was then added
to a final concentration of 5 nM. The cell-DNA mixtures were
incubated on ice for 20 min, loaded into an electroporation
chamber precooled on ice (length, 5 mm; cross section, 100
mm^2), and exposed to a 1-s electrical pulse from a 250-µF
capacitor charged to 300 V. Cells were immediately removed
from the chamber and plated into five 100-mm culture
dishes. The plates had been prepared by gelatinization and
contained 7 ml of nonselective medium. The cells were
allowed to recover overnight. The next day, the number of
colonies in each dish was determined by counting, and HAT
selection was then applied.
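
For orientation, the quantities implied by the electroporation conditions can be worked out as follows. This is a sketch under stated assumptions: an average mass of 650 g/mol per base pair of double-stranded DNA, a vector length of 12 kb (the size mentioned later for the integrated vector), and the 5-mm chamber length taken as the distance between the electrodes. None of these is specified in this paragraph, and the function names are illustrative.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair (assumed average for duplex DNA)


def dna_mass_ug(conc_nM, volume_ml, length_bp):
    """Mass in micrograms of duplex DNA giving a molar concentration."""
    moles = conc_nM * 1e-9 * volume_ml * 1e-3
    return moles * length_bp * AVG_BP_MASS * 1e6


def field_strength_v_per_cm(volts, gap_mm):
    """Nominal field strength across the chamber, assuming gap_mm is the electrode gap."""
    return volts / (gap_mm / 10.0)


def stored_energy_j(capacitance_uF, volts):
    """Energy stored in the charged capacitor, E = C * V^2 / 2."""
    return 0.5 * capacitance_uF * 1e-6 * volts ** 2


print(f"DNA per 0.5-ml sample: about {dna_mass_ug(5, 0.5, 12_000):.1f} micrograms")
print(f"nominal field strength: {field_strength_v_per_cm(300, 5):.0f} V/cm")
print(f"stored energy: about {stored_energy_j(250, 300):.2f} J")
```

A different vector length or base-pair mass scales the DNA amount proportionally, so the first figure should be read only as an order-of-magnitude guide.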

Cultures to be assayed for the loss of HPRT
function by selection in 6-TG were maintained under HAT
selection for at least 1 month prior to the start of the
assay in order to kill any accumulated hprt- cells. These
cultures were trypsinized, counted, and then replated at a
density of 0.5 x 10^7 to 1 x 10^7 cells per plate in
nonselective medium. They were grown without selection for
3 or 4 days to allow spontaneous revertants time to purge
residual HPRT transcripts or protein. Selection was then
started by applying 6-TG medium.
All selections were maintained for 16 days, with
feeding as necessary. Targeting and reversion frequencies
were determined by counting the number of resistant colonies
obtained for each experiment. Individual colonies were
picked by using cloning rings into 24-well (1 ml per well)
dishes and maintained under selection. Cultures were
transferred to 60-mm culture dishes and then either
harvested for genomic DNA preparation or transferred to
100-mm dishes for further expansion.

Genomic DNA preparation and characterization. DNA was
prepared from expanded clones by using conventional
procedures. Restriction enzyme digestions were done
according to manufacturers' specifications, incubating
overnight. After electrophoresis on 0.8% agarose gels,
Southern blotting was done by standard techniques.

Probes. Two probes were used, a 250-bp RsaI fragment from
intron 3 and a 300-bp HindIII-XhoI fragment from the human
cDNA which includes exons 3 to 6 but is specific for the
mouse exon 3 element (Doetschman et al. (1987) Nature
330:576-578). Both probes hybridize to sequences present in
the endogenous locus as well as on the targeting vectors.
For each blot, 25 to 50 ng of purified fragment was
radiolabeled with 32P-dCTP by the random-primed
oligonucleotide method, using a Boehringer Mannheim kit.
Four-hour prehybridizations and overnight hybridizations
were done in 50% formamide solutions at 42°C. Blots were
washed to a stringency of 1 x SSC (0.15 M NaCl plus 0.015 M
sodium citrate) at 68°C. Washed blots were exposed to
preflashed XAR-5 film at -70°C.

In step: the integration event. The first step in the
two-step targeting procedure is a homologous integration event
that incorporates vector DNA carrying the desired
modification into the genome. The method of Doetschman et
al. ((1987) supra) was used to introduce into mouse ES cells
an integrating targeting vector that carries a 4-bp
insertion in the second intron of the HPRT gene. Either
plasmid pNMR133 or plasmid pNMR133D200 (which has a 200-bp
gap in the region homologous to the target locus) was
electroporated into the male mouse-derived ES cell line
E-14TG2a. This cell line, isolated by Hooper et al. ((1987)
supra) as a spontaneous mutation in culture, carries a
nonreverting deletion of the promoter and first two exons of
the nine-exon, 33-kbp, X-linked HPRT gene (Thompson et al.
(1989) Cell 56:313-321), rendering it phenotypically hprt-.
Both targeting vectors contain approximately 5 kb of DNA
identical in sequence to the exon 3 target region of the
hprt- gene except for the intended modification: a 4-bp
insertion in intron 2 that destroys a unique HindIII site.
They also carry the human HPRT promoter and exon 1 sequences
and the mouse exon 2 region.

The homologous integration event generates a
duplication of the 5-kb target region separated by the
remainder of the vector sequences. The duplicated regions
are identical with the exception of the 4-bp insertion,
identified by a missing HindIII site, that is located on the
downstream repeat. This event restores the promoter and
first two exons deleted from the locus, generating HPRT+
targeted recombinants that can be directly selected with
HAT-containing medium.

Three independent HPRT+ cell lines were isolated
by selection in HAT medium at an average frequency of 2.8 x
10^-6 per electroporated cell, and it was then confirmed that
these cell lines were targeted by genomic Southern blot
hybridization. The blots were probed either with a 250-bp
RsaI fragment from intron 3 or with a 300-bp HindIII-XhoI
fragment from the human cDNA that specifically hybridizes to
the mouse exon 3. Both probes hybridize to sequences found
in the genome as well as on the targeting vectors. All of
the cell lines examined contained the expected recombinant
locus, indicating that a single copy of the targeting
vectors had integrated into the E-14TG2a hprt- gene.

Cell lines A and C hybridized to the single 19-kb
HindIII fragment expected for a simple insertion of the
12-kb vector into the 7-kb endogenous fragment. Cell line
D hybridized to two HindIII fragments, the endogenous 7-kb
and the vector 12-kb fragments. This cell line, generated
with plasmid pNMR133D200, has lost the 4-bp insertion as a
consequence of the integration event, so that revertants
obtained from this line could not be properly modified.
However, it was used in the excision experiments (see below)
since it could still generate useful information about the
frequency and accuracy of the excision reaction. BamHI
digestion of all recombinants revealed the expected 9.4-kb
endogenous band and the 6.9-kb vector-derived band. No
extraneous bands could be detected, confirming that all of
the recombinants carried single-site, single-copy insertions
of the vector DNAs. In addition to these three lines, one
more cell line, B, generated previously in the laboratory
(Doetschman et al. (1987) supra) was used in the excision
studies (see below). This line carries the same recombinant
locus found in lines A and C.



Out step: the excision event. The second step in the two-step
targeting procedure is a spontaneous event that excises from
the genome the vector sequences that integrated in the first
step. A homologous recombination event between the regions
duplicated during the in reaction can occur by either
intrachromatid recombination (Doetschman et al. (1987)
supra) or unequal sister chromatid exchange. A crossover
event in the 2-kb region comprising the 5'-terminal portion
and the HindIII site will leave the 4-bp insertion in the
genome; crossing over in the 3-kb region comprising the
3'-terminal portion will excise the 4-bp modification along
with the vector sequences or move it to the (HPRT+)
triplicated chromosome. Either way, the excision event
removes the vector-derived promoter and first two exons,
causing a reversion to the hprt- phenotype. Such revertants
can be selected with the nucleoside analog 6-TG.

The four HAT^r cell lines described above were used
to study the excision (out) reaction. They all carry
essentially the same HPRT locus: a duplication of 5 kb
separated by 7 kb of plasmid-derived unique sequence. ES
cell line D-3 (Doetschman et al. (1985) J. Embryol. Exp.
Morphol. 87:27-45), which carries the wild-type HPRT locus,
was used as a control in these experiments to determine the
spontaneous rate of mutation from HPRT+ to hprt- at the
normal locus. The experiments were performed as described
above. The day after replating, the number of colonies
observable in each dish was determined by counting.
Typically, ES cells form one colony for every 5 to 10 cells
plated (Table 1).

TABLE 1

Frequency of the Out Reaction

               Cells     Colonies     6-TG^r        Reversion
               Plated    25 h Post    Colonies(b)   Frequency(c)
Cell Line      (10^7)    (10^6)(a)                  x 10^-7

A              2.90      3.2          23            7.9
B              2.2       3.2          13            5.9
C              4.2       6.4          14            3.3
D              3.8       4.0          56            14.7
Total          13.1                   106           8.1
D-3 (control)  5.6       4.8          0             <0.18

(a) Number of colonies counted the day after replating,
reflecting a 10 to 20% plating efficiency.
(b) Number of colonies counted after 2 weeks of 6-TG
selection.
(c) Number of 6-TG^r colonies obtained per plated cell.

This is due to their propensity to form aggregates, not to
a high death rate. That is to say, each colony found the
day after replating is composed of 5 to 10 individual cells.
Although this aggregation may interfere with the 6-TG
selections as a result of metabolic cross-feeding (Hooper et
al. (1981) Int. Rev. Cytol. 69:45-104), it cannot be
avoided.

The number of 6-TG^r colonies obtained for each
line examined and calculated reversion frequencies are
listed in Table 1. As shown, all four HAT^r lines initially
generated by gene targeting reverted to the hprt- phenotype
at similar frequencies, averaging 8 x 10^-7 6-TG^r colonies
isolated for every HAT^r cell plated. Control cell line D-3,
which carries the wild-type HPRT locus, failed to produce
any colonies from 5.6 x 10^7 cells plated. Thus, the
spontaneous mutation frequency at the HPRT locus, for cells
preselected with HAT, is less than 1.8 x 10^-8. This result
is consistent with the rate of 1.5 x 10^-8 per cell generation
reported for the locus by Caskey and Kruh (1979) Cell
16:1-19.
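
The reversion frequencies quoted above follow directly from the counts in Table 1, as the sketch below shows. It assumes that the "Cells Plated" column is in units of 10^7 and the next-day colony column in units of 10^6, a reading that reproduces both the tabulated frequencies and the stated 10 to 20% plating efficiency. The variable and function names are illustrative.

```python
# Values transcribed from Table 1: (cells plated, colonies the day after
# replating, 6-TG-resistant colonies). The 10^7 and 10^6 scale factors are
# an assumed reading of the column headers.
TABLE_1 = {
    "A": (2.9e7, 3.2e6, 23),
    "B": (2.2e7, 3.2e6, 13),
    "C": (4.2e7, 6.4e6, 14),
    "D": (3.8e7, 4.0e6, 56),
}
CONTROL_D3 = (5.6e7, 4.8e6, 0)


def reversion_frequency(cells_plated, resistant_colonies):
    """6-TG-resistant colonies obtained per plated cell."""
    return resistant_colonies / cells_plated


def plating_efficiency(cells_plated, colonies_next_day):
    """Fraction of plated cells that appear as colonies the next day."""
    return colonies_next_day / cells_plated


for name, (cells, next_day, resistant) in TABLE_1.items():
    print(name,
          f"frequency = {reversion_frequency(cells, resistant):.1e}",
          f"plating efficiency = {plating_efficiency(cells, next_day):.0%}")

total_cells = sum(cells for cells, _, _ in TABLE_1.values())
total_resistant = sum(resistant for _, _, resistant in TABLE_1.values())
pooled = reversion_frequency(total_cells, total_resistant)
print(f"pooled frequency = {pooled:.1e}")

# With no colonies, only an upper bound (one colony per cells plated) applies.
upper_bound = 1 / CONTROL_D3[0]
print(f"control upper bound < {upper_bound:.1e}")
```

The control value is an upper bound rather than a measured frequency, because no resistant colonies were recovered from the D-3 line.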

Several of the individual 6-TG^r colonies were
analyzed further by genomic Southern blot hybridization,
using the HPRT-specific probes described above. The number
of colonies examined from each line and a summary of the
results obtained from the Southern blot hybridizations are
presented in Table 2.

TABLE 2

Accuracy of the Out Reaction

             Colonies    Accurate         Revertants with
Cell Line    examined    revertants(a)    4-bp insert(b)

A            3           2                2/2
B            9           7                6/7
C            11          11               11/11
D            3           3                NA
Total        26          23               19/20

(a) Number of colonies containing the expected hprt- locus.
(b) Number of hprt- colonies that retain the 4-bp
insertion, presented as a fraction of the number of
accurate revertants obtained.
NA, Not applicable: this cell line does not carry the
4-bp insertion.



Of a total of 26 6-TG^r colonies examined, 23 (88%)
had executed the out reaction and accurately excised the



WO 92/20808 PCT/US92/04054


integrated vector sequences from the genome, as determined
by the genomic Southern blots. They all revealed a single
9.4-kb BamHI band upon hybridization, the size predicted for
a simple homologous excision event. This is the same BamHI
fragment found in the parental E-14TG2a hprt- locus. HindIII
digestion of the revertant DNAs is expected to reveal one of
two bands upon hybridization: either an 11-kb fragment, if
the crossover occurs in the 5' region and the 4-bp insertion
introduced by the in event is retained, or a 7-kb fragment,
if the crossover occurs in the 3' region and the
modification is removed from the hprt- genome.

Of the 23 out revertants examined, 20 were derived
from HPRT+ cell lines that carried the 4-bp insertion
initially introduced by the targeted integration event; 19
of these colonies contain the single 11-kb HindIII fragment,
indicative of the accurate excision event which retains the
4-bp insertion. Thus, these 19 colonies have been correctly
modified by the in-out targeting procedure. Only 1 of these
20 revertant colonies lost the 4-bp modification, as
determined by the presence of a 7-kb HindIII fragment.
Therefore, 95% of the accurate revertants which could have
retained the 4-bp insertion did so. The three remaining
colonies which were found to have been generated by the out
reaction were derived from the HAT^r cell line D. As this
cell line does not carry the 4-bp modification, the
revertants revealed only the 7-kb HindIII band upon
hybridization.

To confirm that the 11-kb HindIII band
characterizing the accurately modified hprt- revertants
retained the 4-bp insertion initially introduced by the
targeting vector, two of the genomic DNAs were digested with
NheI. The 4-bp insertion introduced to destroy the
HindIII site in the original targeting vector generates a
unique NheI site. In the case of the revertants that have
retained the 4-bp insertion (11-kb HindIII), NheI digestion
will generate a 2.7-kb band which hybridizes to the probes.
In the case of the revertants which have lost the insertion
(7-kb HindIII), a 4.9-kb band will result. The accurately
modified revertant does contain a 2.7-kb NheI fragment which
hybridizes to the probe, confirming the presence of the 4-bp
insertion, and the revertant which has lost the 4-bp
insertion reveals a 4.9-kb fragment upon hybridization.

The other three colonies examined were found to
contain aberrant hprt- loci that did not arise by the
predicted homologous excision reaction. They contained a
single 14-kb HindIII fragment and a 16-kb BamHI fragment
that hybridized to the probes. These fragments failed to
hybridize to a plasmid-specific probe, indicating that the
target vector sequences have been excised from the genome.
Since the bands are not the expected sizes, these colonies
were probably generated by an alternate excision reaction.
Because they account for only 12% of the 6-TG^r colonies
obtained, they were not examined further.
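
The Southern blot criteria used to sort the 6-TG-resistant colonies can be expressed as a simple decision rule. The sketch below is illustrative only: the band sizes and colony counts are those reported in the text, while the function name, the category labels, and the 0.3-kb matching tolerance are assumptions introduced for the example.

```python
from collections import Counter


def classify_revertant(hindiii_kb, bamhi_kb, tol=0.3):
    """Classify a 6-TG-resistant colony from its Southern band sizes (kb)."""
    def near(observed, expected):
        # Treat a band as matching if it lies within the assumed tolerance.
        return abs(observed - expected) <= tol

    if near(bamhi_kb, 9.4) and near(hindiii_kb, 11):
        return "accurate, insertion retained"
    if near(bamhi_kb, 9.4) and near(hindiii_kb, 7):
        return "accurate, insertion absent"
    return "aberrant"


# Band patterns (HindIII, BamHI) matching the counts reported in the text:
# 19 colonies with the 11-kb band, 1 + 3 with the 7-kb band, 3 aberrant.
colonies = [(11, 9.4)] * 19 + [(7, 9.4)] * 4 + [(14, 16)] * 3
tally = Counter(classify_revertant(h, b) for h, b in colonies)

accurate = (tally["accurate, insertion retained"]
            + tally["accurate, insertion absent"])
print(tally)
print(f"accurate excisions: {accurate / len(colonies):.0%}")
print(f"aberrant: {tally['aberrant'] / len(colonies):.0%}")
```

The computed proportions correspond to the 88% and 12% figures given for accurate and aberrant colonies, respectively.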

The above data demonstrate the successful in-out
targeting in modifying the genome of a mouse ES cell line by
introducing a 4-bp insertion, in a two step procedure, the
second step being automatic. The average frequency of the
in reaction was found to be 2.8 x 10^-6. The frequency of the
out reaction, 8 x 10^-7 per HAT^r cell plated, is approximately
30% that of the in reaction. This frequency is 40-fold
higher than the spontaneous mutation rate at the normal HPRT
locus. Of the 6-TG^r colonies isolated, 88% had accurately
excised the target vector sequences from the genome.
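
The comparisons in this paragraph can be checked with a few lines of arithmetic. The exponents used below (10^-6, 10^-7 and 10^-8) are a reading of the printed values that is consistent with Table 1 and with the stated ratios; they should be treated as assumptions, as should the variable names.

```python
in_frequency = 2.8e-6             # in reaction, per electroporated cell
out_frequency = 8e-7              # out reaction, per HAT-resistant cell plated
spontaneous_upper_bound = 1.8e-8  # upper bound at the normal HPRT locus

out_relative_to_in = out_frequency / in_frequency
fold_over_spontaneous = out_frequency / spontaneous_upper_bound

print(f"out / in = {out_relative_to_in:.0%}")
print(f"out / spontaneous = at least {fold_over_spontaneous:.0f}-fold")
```

Because the spontaneous value is only an upper bound, the fold difference computed here is a minimum estimate.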

It has been shown that metabolic cross-feeding can
interfere with 6-TG selections; therefore, the reversion
frequency determined may be an underestimate. Because the
ES cells always form aggregates upon plating, the
possibility of such cross-feeding could not be eliminated.
This suggests that when using the in-out targeting procedure
in other cell lines, it may be useful to plate HAT^r cells at
a lower density to minimize the potential of revertant loss
due to such cross-feeding.

Although the ES cells were not grown on feeder layers, the
pluripotential nature of the cells was retained. This was
achieved by supplementing the growth medium with purified
human leukemia inhibitory factor (Smith et al. (1988)
Nature 336:688-690; Williams et al. (1988) Nature
336:684-687). This modification greatly simplifies the 2-step
selection procedure.
The above results show that both the integration
and excision events can occur accurately and with a
frequency sufficient for use in a 2-step targeting
technique. While the above procedure employs the directly
selectable HPRT locus, the same procedure could be
adapted to modify non-selectable loci in an hprt- cell line
by using the HPRT minigene described by Reid et al. (1990)
Proc. Natl. Acad. Sci. USA 87:4299-4303. The minigene would
be carried on an integrating targeting vector, thereby
allowing selection to be used for both the integration and
excision events. Homologous recombinants are likely to be
found after the in step at a frequency of 1 in 1,000 HAT^r
cells, this being the ratio of transformed to targeted cells
reported previously. The targeted cell lines can then be
identified by the polymerase chain reaction, for example, or
other means, depending upon the nature of the targeting gene
and the modification. Including a small gap in the region
of homology on the insertional vector provides a convenient
primer binding site, since all gaps are repaired during the
homologous insertion event. Excision-derived hprt-
revertants are likely to be found after the out step at a
frequency of nearly 1 in 10^6 per targeted cell line.

The above techniques allow for modification of
genomes of viable cells, particularly embryonic cells, where
a stable modification can be achieved, which can be
inherited by progeny cells. In addition, the modifications
can be subtle, and functional target genes can be achieved
even when a marker is allowed to remain in the genome. Thus,
the subject invention demonstrates the feasibility of gene
therapy with stem cells or other cells, which can be used
for the treatment of a variety of genetic or other diseases.

All publications and patent applications mentioned
in this specification are herein incorporated by reference
to the same extent as if each individual publication or
patent application was specifically and individually
indicated to be incorporated by reference.

The invention now being fully described, it will
be apparent to one of ordinary skill in the art that many
changes and modifications can be made thereto without
departing from the spirit or scope of the appended claims.
(E) COUNTRY: USA

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CCCAGACACT CTTGCAGATT 20

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
CAGATCTGGC TCGAGGCATG 20

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TGCGCTGACA GCCGGAACAC 20

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
AATAGACCAA TAGGCAGAG 19

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CACCTGACTC CTGT 14

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CACCTGACTC CTGA 14
WHAT IS CLAIMED IS:

1. A method for introducing changes at a target locus
in a chromosome of a viable mammalian cell, said method
comprising:

transforming said cell with a linear DNA construct
comprising a sequence having at least 50 bp of homology with
an indigenous region of said target locus, said homology
comprising a sequence different from said target locus, and
a marker gene allowing for selection of cells comprising
said marker gene, wherein said DNA construct is an
Ω-targeting vector or an O-targeting vector, wherein a
non-homologous sequence forms an internal loop or an external
loop, respectively;

growing said transformed cells in selective medium to
provide marker gene containing cells; and

isolating cells comprising said change in said
indigenous region by identifying the presence of said
construct sequence at said locus, wherein when said
construct is an Ω-vector said non-homologous region is at a
site which does not substantially interfere with the
functioning of said target locus.

2. A method according to Claim 1, wherein said linear
DNA construct is an O-vector and is linearized in the
homologous sequence and has a gap in homology with said
indigenous region, when the ends of said linear sequence are
joined.

3. A method according to Claim 2, wherein said gap
defines a primer sequence, and the polymerase chain reaction
is used to identify said subtle change by using a primer
complementary to the homologous sequence at the gap.

4. A method according to Claim 1, wherein said 
transforming is by electroporation. 
5. A method according to Claim 1, wherein said vector
is an O-vector and including the additional step of growing
said isolated cells for sufficient time for the indigenous
region and markers of said target locus to be excised; and

identifying cells comprising the homologous sequence
and lacking said indigenous sequence, but retaining said
modification, by means of a marker allowing for negative
selection.

6. A method according to Claim 1, wherein said
mammalian cells are embryonic cells.

7. A method according to Claim 1, wherein at least
one terminus of said linear DNA construct comprises at least
about 50 bp of homology with said indigenous region.

8. A method according to Claim 1, wherein said target
locus comprises a defective globin gene and said construct
comprises a functional globin gene.

9. A method for introducing a change in a gene at a
target locus in a chromosome of a viable mammalian cell,
said method comprising:

transforming said cell with a linear DNA construct
comprising an O-targeting vector having (1) a sequence of at
least 50 bp of homology with an indigenous region of said
gene and differing from said indigenous region and (2) one
or a combination of marker genes allowing for positive and
negative selection of cells comprising said marker gene(s);

growing said transformed cells in selective medium to
positively select for marker gene containing cells in a
first step and negatively select for marker gene containing
cells in a second step; and

isolating cells comprising said subtle change in said
gene.

10. A method according to Claim 9, wherein said 
construct comprises a hprt gene. 

11. A method according to Claim 9, including 
identifying said modified cells, wherein said identifying is 
by means of monoclonal antibodies specific for the 
polypeptide encoded by said homologous sequence. 

12. A method for introducing a change in a gene at a 
target locus in a chromosome of a viable mammalian cell, 
said method comprising: 

transforming said cell with a linear DNA construct 
comprising an Ω-targeting vector having (1) a sequence of at 
least 50 bp of homology with an indigenous region of said 
gene and differing from said indigenous region and (2) at 
least one marker gene allowing for positive selection of 
cells comprising said marker gene; 

growing said transformed cells in selective medium to 
positively select for marker gene containing cells; and 

isolating cells comprising said change in said gene. 

13. A method according to Claim 12, wherein said 
marker gene is antibiotic resistance. 

14. A method according to Claim 12, wherein said 
marker gene is 5' of said gene and flanked 5' by at least 50 
bp of homologous sequence. 

15. A linear targeting vector construct comprising a 
wild-type structural gene sequence of a gene commonly 
associated with a genetic disease as a result of a 
difference in sequence, said wild-type structural gene being 
homologous at the chromosomal locus of said difference, at 
least one marker gene for positive selection and having 
flanking homologous sequences to said marker gene, wherein 
the homologous sequences proximal to the termini of said 
vector construct are either distal at said chromosomal locus 
to define an Ω-targeting vector or proximal at said 
chromosomal locus to define an O-targeting vector. 

16. A targeting vector according to Claim 15, wherein 
said at least one marker comprises a marker for positive 
selection and a marker for negative selection. 

17. A targeting vector according to Claim 15, wherein 
said vector is an Ω-targeting vector and comprises a marker 
gene at one terminus. 



18. A targeting vector according to Claim 15, wherein 
said gene is the β-globin gene. 

AMENDED CLAIMS 

[received by the International Bureau 
on 18 September 1992 (18.09.92); 
original claims 1, 5, 9-11 and 14 amended; 
remaining claims unchanged (3 pages)] 

5. A method according to Claim 1, wherein said 
vector is an O-vector and including the additional step 
of growing said isolated cells for sufficient time for 
the indigenous region and markers of said target locus to 
be excised; and 

identifying cells comprising the homologous sequence 
and lacking said indigenous sequence, but retaining said 
subtle change, by means of a marker allowing for negative 
selection. 

6. A method according to Claim 1, wherein said 
mammalian cells are embryonic cells. 

7. A method according to Claim 1, wherein at least 
one terminus of said linear DNA construct comprises at 
least about 50 bp of homology with said indigenous 
region. 

8. A method according to Claim 1, wherein said 
target locus comprises a defective globin gene and said 
construct comprises a functional globin gene. 

9. A method for introducing a subtle change in a 
gene at a target locus in a chromosome of a viable 
mammalian cell, said method comprising: 

transforming said cell with a linear DNA construct 
comprising an O-targeting vector having (1) a sequence of 
at least 50 bp of homology with an indigenous region of 
said gene and differing from said indigenous region and 
(2) one or a combination of marker genes allowing for 
positive and negative selection of cells comprising said 
marker gene(s); 

using said marker gene for selection by growing 
said transformed cells in selective medium to positively 
select for marker gene containing cells in a first step 
and negatively select for marker gene containing cells in 
a second step; and 

isolating cells comprising said subtle change in 
said gene. 

10. A method according to Claim 9, including 
identifying said cells comprising said subtly changed 
gene, wherein said identifying is by means of monoclonal 
antibodies specific for the polypeptide encoded by said 
homologous sequence. 

11. A method for introducing a subtle change in a 
gene at a target locus in a chromosome of a viable 
mammalian cell, said method comprising: 

transforming said cells with a linear DNA construct 
comprising an Ω-targeting vector having (1) a sequence of 
at least 50 bp of homology with an indigenous region of 
said gene and differing from said indigenous region and 
(2) at least one marker gene allowing for positive 
selection of cells comprising said marker gene; 

growing said transformed cells in selective medium 
to positively select for marker gene containing cells; 
and 

isolating cells comprising said subtle change in 
said gene. 

12. A method according to Claim 11, wherein said 
marker gene is antibiotic resistance. 

13. A method according to Claim 11, wherein said 
marker gene is 5' of said gene and flanked 5' by at least 
50 bp of homologous sequence. 

14. A linear targeting vector construct comprising 
a wild-type structural gene sequence of a gene commonly 
associated with a genetic disease as a result of a 
difference in sequence, said wild-type structural gene 
being homologous at the chromosomal locus of said 
difference, at least one marker gene for positive 
selection and having flanking homologous sequences to 
said marker gene, wherein the homologous sequences 
proximal to the termini of said vector construct are 
either distal at said chromosomal locus to define an 
Ω-targeting vector or proximal at said chromosomal locus to 
define an O-targeting vector. 

15. A targeting vector according to Claim 14, 
wherein at least one marker comprises a marker for 
positive selection and a marker for negative selection. 

16. A targeting vector according to Claim 14, 
wherein said vector is an Ω-targeting vector and 
comprises a marker gene at one terminus. 



17. A targeting vector according to Claim 14, 
wherein said target gene is the β-globin gene. 